Maximum a posteriori adaptation for many-to-one eigenvoice conversion
Authors
Abstract
Many-to-one eigenvoice conversion (EVC) converts an arbitrary speaker's voice into the voice of a predetermined target speaker. In this method, a canonical eigenvoice Gaussian mixture model is adapted to any source speaker using only a few utterances as adaptation data. In this paper, we propose a many-to-one EVC based on maximum a posteriori (MAP) adaptation to make the adaptation process more robust to the amount of adaptation data. Results of objective and subjective evaluations demonstrate that the proposed method is more effective than the conventional many-to-one VC methods for any amount of adaptation data (e.g., from 300 ms to 16 utterances).
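To illustrate why a MAP criterion can stabilize adaptation when little data is available, the following is a minimal sketch and not the paper's formulation. It assumes a single Gaussian with diagonal covariance whose mean is `bias + basis @ w`, and a Gaussian prior on the eigenvoice weights `w`; the actual method uses a full eigenvoice GMM. The function name, the parameter names, and the prior scaling factor `tau` are all hypothetical.

```python
import numpy as np

def map_eigenvoice_weights(frames, bias, basis, var, prior_mean, prior_cov, tau=1.0):
    """MAP estimate of eigenvoice weights w (illustrative, single Gaussian).

    Assumed model (not taken from the paper):
        x_t ~ N(bias + basis @ w, diag(var))
        w   ~ N(prior_mean, prior_cov / tau)
    frames: (N, D) adaptation frames, basis: (D, K) eigenvectors.
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    bt_prec = basis.T / var                      # B^T Sigma^-1, shape (K, D)
    prior_prec = tau * np.linalg.inv(prior_cov)  # scaled prior precision
    # Normal equations: data term plus prior term on both sides.
    lhs = n * (bt_prec @ basis) + prior_prec
    rhs = bt_prec @ (frames - bias).sum(axis=0) + prior_prec @ prior_mean
    return np.linalg.solve(lhs, rhs)

# Toy setup: D = 4 feature dimensions, K = 2 eigenvectors.
basis = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
bias = np.zeros(4)
var = np.ones(4)
prior_mean = np.zeros(2)
prior_cov = np.eye(2)

w_true = np.array([2.0, -1.0])
many_frames = np.tile(bias + basis @ w_true, (1000, 1))  # noise-free frames
w_no_data = map_eigenvoice_weights(np.empty((0, 4)), bias, basis, var, prior_mean, prior_cov)
w_much_data = map_eigenvoice_weights(many_frames, bias, basis, var, prior_mean, prior_cov)
```

With no adaptation frames the estimate falls back to the prior mean, and with many frames it approaches the maximum likelihood solution. In this simplified model, that is the behavior that makes a MAP estimate less sensitive to the amount of adaptation data.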
Similar resources
Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition
In this paper, we present a maximum a posteriori (MAP) eigenvoice speaker adaptation approach for a self-adaptation system. The proposed MAP eigenvoice is developed by introducing a probability density model for the eigenvoice coefficients. We also build a self-adaptation system that is useful for public users, because the user does not need to speak several sentences for adaptation. In self-adaptati...
Speaker Adaptation Techniques for Automatic Speech Recognition
Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these ad...
Discriminative speaker adaptation with eigenvoices
Eigenvoice is an effective speaker adaptation approach, capable of balancing performance against the requirement for a large amount of adaptation data. However, the conventional Maximum Likelihood Eigen-Decomposition (MLED) method in eigenvoice adaptation is based on the Maximum Likelihood (ML) criterion and suffers from the unrealistic assumption made by the HMM about the speech process, so alternative sc...
Acoustic Model Adaptation for Speech Recognition
Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these a...
An improved one-to-many eigenvoice conversion system
We have previously developed a one-to-many eigenvoice conversion (EVC) system enabling the conversion from a specific source speaker's voice into an arbitrary target speaker's voice. In this system, an eigenvoice Gaussian mixture model (EV-GMM) is trained in advance with multiple parallel data sets composed of utterance pairs of the source and many pre-stored target speakers. The EV-GMM is effecti...